Abstract: In the Whirlwind I computer, constructed at MIT under Office of
Naval Research sponsorship and presently operated under Joint
Services support, it has been found that marginal checking
vastly reduces the machine failure rate. A series of test
programs, each of which thoroughly exercises a different section
of the machine, is used in the marginal checking procedure.

Marginal checking cannot prevent intermittent and total failures
caused by shorts and opens. These are isolated by methods
combining built-in checking features, diagnostic programming, signal
tracing, and operator experience and ingenuity. These methods
are greatly facilitated by a special program control which allows
a periodically repeated test program to be stopped at an
arbitrary point to study indicator lights and signal waveforms.
1.0 INTRODUCTION


Through four years of experience in maintaining the Whirlwind I
computer, several improvements in trouble location techniques over those
originally conceived have been worked out. This experience provided
knowledge of what types of failures must be dealt with, what procedures
are most effective, and what special features are helpful to an operator
in localising trouble. The Whirlwind computer was constructed at MIT
under sponsorship of the Office of Naval Research and is presently
operated under support of the Joint Services. I will first discuss briefly
the types of faults which are encountered, then will outline basic
philosophies of failure diagnosis which are peculiar to the machine. Next
I will describe facilities provided to aid an operator in his diagnoses,
and finally will illustrate the actual procedures which are in use.


2.0 FAULTS TO BE DIAGNOSED

Faults in the computer system are classified into four
categories. Three of these are well known and typical of any electronic
equipment. They are (1) gradual deterioration, (2) sudden failures
such as shorts or opens, and (3) intermittent or transient failures.
The fourth category is peculiar to an experimental machine in which
modification and expansion are being carried out. Since the central
portion of the computer became operative, there has been a continuing
program to expand the internal storage capacity and the terminal
equipment facilities. Because of this work, it is necessary to contend with
faults that are the result of maladjustment and weaknesses in newly
installed equipment. These then form the fourth category.

With the procedures which have been worked out in Whirlwind I,
it has been found that the faults which can be located most easily are
sudden complete failures. Gradual deterioration and defects associated
with newly installed equipment also are relatively easy to find.
Intermittent failures, however, are difficult to deal with and therefore are
considered the most serious.


3.0  PHILOSOPHY  OF  FAILURE  DIAGNOSIS 


It is to be expected that the trouble location methods used
in a computer reflect its logical design. In Whirlwind, these trouble
location methods also reflect the mechanical arrangement of the system.
When the Whirlwind computer was being planned, it was felt that panels
should be constructed so that all component connections would be readily
accessible while the system was in operation. This would facilitate
signal tracing with video probes while the system was first being checked
out, and would side-step many packaging problems. With this extremely
open type of construction, it has been found more practical to repair
circuits in place rather than to substitute spare panels. Obviously
this makes trouble location procedures more complex. Faults must be
completely isolated rather than merely localised to a given panel or
chassis. A strong argument in favor of such an arrangement is that
the computer can be used as a powerful testing device. Bench testing,
with necessarily limited facilities for signal generation and detection,
sometimes may not show up all the malfunctions in a circuit.

Another mechanical design feature is reflected in the trouble
location methods now employed. It is the layout of the computer's
control center, which consists of a flexible arrangement of panels in
standard racks rather than a relatively fixed operating console. This
has encouraged the installation of special machine controls and special
facilities for monitoring critical signals for testing purposes. Of
particular value is some equipment which can be used to change the
over-all logic of the machine control. I will describe this later in
my talk.


As a final point on the philosophy of failure diagnosis in
Whirlwind, considerable emphasis is placed on marginal checking. By
discovering deteriorating circuits before they cause trouble the number
of interrupting failures can be kept low. The possibility of a
deteriorating component causing intermittent failures, the type most difficult
to isolate, is virtually eliminated.


4.0 EQUIPMENT AIDS IN TROUBLE LOCATION


I have just described the types of faults to be diagnosed and
some special characteristics of the computer which have influenced the
choice of trouble location methods used. Now a brief discussion of the
equipment provided to aid in trouble diagnosis will complete the
background needed for an explanation of the actual checking procedures used.

4.1 Built-In Alarms


An important aid to the operator, in fact the one around which
nearly all of the trouble location procedures are centered, is a system
of built-in alarms. There are a total of eight different alarm
indications. Any one of these will stop the computer operation when the
alarm occurs. These eight alarms are evenly distributed among the four
main subdivisions of the computer: the central control, the arithmetic
element, the internal memory, and the input-output element. Generally
speaking, they are designed to monitor the operation of critical control
circuits or to show up certain cases of nonpermissible programming. One
of the alarms, a transfer check, is applied more frequently than the
others and covers all sections of the computer. It checks that words
transferred between registers by means of the common bus system are
correctly received. The check is accomplished by a special register
which receives the word by two different paths, one directly from the
main bus and the second from the receiving register via a second check
bus.
The special identity checking facilities of the transfer check
have been used for implementing an identity check order as a part of the
standard order code. This order makes it possible for a programmer to
arbitrarily command a check on the contents of the accumulator against
a word stored in the memory. Such an order obviously is valuable in
trouble location and diagnostic programming work.
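
The two-path comparison behind the transfer check, and the programmed
identity check built on it, can be sketched roughly as follows. This is an
illustrative software model only, not a description of the actual circuits:
the function names and the corrupt_mask fault model are hypothetical, and
the 16-bit word length is the Whirlwind I word length, which this talk does
not itself state.

```python
WORD_MASK = 0xFFFF  # assumed 16-bit word length (not stated in the talk)


def transfer_with_check(word, corrupt_mask=0):
    """Model one bus transfer and return (received_word, alarm).

    corrupt_mask is a hypothetical fault model: any set bit is flipped
    as the word enters the receiving register.
    """
    word &= WORD_MASK
    # Path 1: the check register takes the word directly from the main bus.
    direct_copy = word
    # The receiving register takes the word from the same bus, possibly faultily.
    received = (word ^ corrupt_mask) & WORD_MASK
    # Path 2: the receiving register's contents come back over the check bus.
    echoed_copy = received
    # Any disagreement between the two paths raises the transfer-check alarm.
    alarm = direct_copy != echoed_copy
    return received, alarm


def identity_check(accumulator, stored_word):
    """Programmed identity check: True (alarm) if the two words differ."""
    return (accumulator & WORD_MASK) != (stored_word & WORD_MASK)
```

A fault-free transfer returns the word with no alarm, while a single
flipped bit in the receiving register is enough to raise one.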

4.2 Marginal Checking Equipment

For locating gradual deterioration, the marginal checking
system is the principal tool. Marginal checking consists of variation
of certain d-c supply voltages to the tubes rather than variation of
heater voltages. The circuits for marginal checking are an integral
part of the power distribution system and are so designed that voltage
variation can take place in only a small section of the computer at a
time. The whole computer is divided into about two hundred such sections.
These may be chosen manually or in an automatic sequence during marginal
checking procedures. Insofar as possible the sectionalization was done
so that logically dependent parts of the computer are on different voltage
variation circuits. This combines a powerful trouble location feature
with the ability to determine whether the system performance is
deteriorating.


4.3 Cyclic Program Control

Sudden failures and certain types of intermittent failures
require a diagnostic approach different from that for deteriorating
components. To assist in a detailed analysis of such troubles, a special
computer control feature, called a cyclic program control, has been
provided. Basically the cyclic program control permits a change in machine
logic. It makes it possible to interpret flip-flop indicators and signal
waveforms while preserving normal high-speed operation of the particular
program giving the trouble. This control embodies mechanisms to stop
the computer at any step in the program and then to restart it at the
beginning of the program. Since the number of orders executed may be
adjusted by simply varying a delay, the flow of information from one
register to another can be observed visually on an oscilloscope.
Furthermore, the restart is somewhat delayed following a stop so this same
flow of information between registers can be observed on flip-flop
indicator lights grouped at the central control location. In general,
the cyclic program control permits an operator to set up complicated
conditions within the computer identical or equivalent to those of
normal operation and at the same time obtain an outward simplicity
that makes analysis relatively easy.
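
The restart-and-stop behavior described above can be illustrated with a
small software analogy. This is only a sketch under stated assumptions: the
real control varies a delay in hardware, whereas here an order count
(stop_after) stands in for that delay, and the three-order program and
register names are hypothetical.

```python
def run_cycle(program, stop_after, initial_state):
    """Restart the program at its first order and stop after `stop_after` orders.

    Returns the register contents at the stop point, which is what the
    flip-flop indicator lights would display during the pause.
    """
    state = dict(initial_state)  # every cycle begins from the same conditions
    for order in program[:stop_after]:
        order(state)
    return state


def trace(program, initial_state):
    """One snapshot per stop point: after 1, 2, ..., len(program) orders."""
    return [run_cycle(program, n, initial_state)
            for n in range(1, len(program) + 1)]


# Hypothetical three-order test program acting on named registers.
def clear_acc(s):
    s["acc"] = 0


def add_mem(s):
    s["acc"] += s["mem"]


def store_acc(s):
    s["out"] = s["acc"]


PROGRAM = [clear_acc, add_mem, store_acc]
START = {"acc": 7, "mem": 5, "out": 0}
```

Calling trace(PROGRAM, START) yields the register contents after each
successive order, so the step at which information first fails to transfer
can be read off by comparing neighboring snapshots.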

4.4 Records of Intermittent Failures


For intermittent type failures, little specialized equipment
is available to assist in trouble location. Two features are worthy
of mention. First, a camera has been set up so that the control panels
can be photographed to show all flip-flop light indications and all
control switch settings at the time of an error. This makes it practical
to preserve data on all such errors without seriously delaying
applications work. The photograph is supplemented by a report giving other
details concerning the program and method of using the computer that
might be helpful in later study of the failure.

Since many intermittent failures are the result of poor
connections on panels or momentary shorts within tubes, they can be
precipitated by shock or vibration. A second feature which helps in
localizing intermittent trouble is an arrangement for producing throughout
the computer room an audible signal characteristic of the program being
run. As tubes or panels are being tapped an intermittent fault is
indicated by an interruption of this signal, after which the program
automatically restarts.


5.0 TROUBLE LOCATION PROCEDURES

A more comprehensive picture of the built-in aids just described
can be obtained from a description of the diagnostic procedures used. I
will first discuss marginal checking and then will illustrate methods of
locating sudden and intermittent faults.

5.1  Marginal  Checking 

Checking for low operating margins is a daily preventive
maintenance procedure. For the complete routine, several different programs
are used, each designed to thoroughly exercise a different portion of the
computer. The principle followed is that when one portion has passed a
test satisfactorily it may then safely be used in checking another part
of the computer. For example, a test is first made of the central
control using a minimum of storage, arithmetic element, and input-output
facilities. Next is a thorough test of the arithmetic element, followed
by tests of storage, and finally of the input-output element. The
programs for these tests are designed with as many check orders as possible
so that no more than a few orders can be executed after any error before
the computer is stopped by an alarm.
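
The bootstrapping principle above, in which each section is trusted only
after it has passed its own test, can be sketched as a simple ordered
routine. The section names come from the talk; the function name and the
section_passes callback are hypothetical stand-ins for the actual test
programs.

```python
# Order taken from the talk: each test leans only on sections already passed.
TEST_ORDER = [
    "central control",
    "arithmetic element",
    "storage",
    "input-output element",
]


def run_marginal_routine(section_passes, order=TEST_ORDER):
    """Test sections in order and stop at the first failure.

    section_passes is a hypothetical callback: it takes a section name and
    returns True if that section's test program ran without an alarm.
    Returns (trusted_sections, failed_section), where failed_section is
    None if every section passed.
    """
    trusted = []
    for section in order:
        if not section_passes(section):
            return trusted, section
        trusted.append(section)
    return trusted, None
```

Because the routine stops at the first failing section, a fault is never
attributed to a section whose test depended on equipment that had not yet
been verified.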

Typical operating procedure for testing a section of the
computer is as follows. The marginal checking equipment is set for an
automatic mode in which it selects voltage variation lines in sequence
and applies a voltage excursion to each. The magnitude of the voltage
excursion is preset for each line and therefore may differ from one line
to the next. The preset values are those that give excursions 10 percent
less than the maxima the circuits can tolerate without failing. With
such settings an automatic marginal checking sequence will cause no
failures until the margin on a circuit has dropped by more than 10
percent. If deterioration of some component causes the margin for a
line to drop more than 10 percent, during automatic marginal checking
an alarm will occur which stops the equipment and permits manual
determination of the new failure point. The excursion is then reset
to 10 percent below this new value and the new excursion is entered on
a record sheet for this line. In this manner, the only data which need
be recorded during the routine checking are those on the few lines which
have deteriorated appreciably. Unless there has been an abnormally large
drop in margin, no corrective action is taken during the marginal checking
period. Instead, a weekly maintenance period is scheduled during which
circuits whose margins are approaching a dangerously low value are
investigated and repaired.
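
The 10 percent rule described above amounts to a small piece of arithmetic,
sketched below. The line names and voltages are hypothetical, excursions
are treated as magnitudes in volts, and the actual equipment of course does
this with power supply circuits and a paper record sheet rather than a
program.

```python
SAFETY_PERCENT = 10  # preset excursions sit 10 percent below the failure point


def preset_excursion(failure_point_volts):
    """Excursion to set for a line whose circuits fail at failure_point_volts."""
    return failure_point_volts * (100 - SAFETY_PERCENT) / 100


def automatic_check(presets, failure_points):
    """Step through the lines in sequence, applying each preset excursion.

    Returns the first line that alarms (its preset excursion now exceeds
    what the circuits tolerate), or None if the whole sequence passes.
    """
    for line, excursion in presets.items():
        if excursion > failure_points[line]:
            return line
    return None


def record_new_excursion(presets, line, new_failure_point):
    """After manual determination of the new failure point, reset the line."""
    presets[line] = preset_excursion(new_failure_point)
    return presets[line]
```

For a hypothetical line that originally failed at 40 volts, the preset is
36 volts. A drop of the failure point to 37 volts passes unnoticed, while a
drop to 35 volts (more than 10 percent) stops the sequence, after which the
line is reset to 31.5 volts and the new value is recorded.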

As was pointed out at the beginning of my talk, one type of
fault that must be dealt with in the Whirlwind machine is maladjustment
or other weaknesses in the system resulting from installation of new
equipment. Abnormally large changes in margins detected during a routine
checking period are one way in which such weaknesses are made apparent.
For example, one installation required that an existing control pulse
be also fed into the new equipment. In order to do this the physical
arrangement of video cables carrying this signal was changed although
their logical function was not. After this installation several low
margins were found which were the result of an unforeseen change in
pulse timing caused by the change in pulse routing.

The marginal checking facilities are also valuable in trouble
location work not related to the routine preventive maintenance,
especially in evaluating the performance of new circuits or ones that have
been repaired. In a typical case an electronic switch utilising eight
flip-flops of a new design was installed after passing exhaustive bench
tests. In the computer system, it was found that the flip-flops showed
low margins and several failures of the switch were reported within a
week. Improved flip-flop circuits which gave wide margins were then
substituted. These have operated about six months without failure.

5.2  Sudden  Failures 


For sudden or intermittent failures a somewhat different
approach is needed. In the case of a sudden failure within the system,
it is necessary to isolate and repair the circuit in order to get the
system back into operation. Fortunately the procedure for doing this is
relatively straightforward so little time is lost on the average. A
program is inserted which shows the failure. This can be the one that
was in use at the time the failure occurred or a simplified one designed
on the spot which produces the same failure. With the cyclic program
control, it is possible to quickly determine on which step the alarm
occurs. This control periodically restarts the program and then stops
it after an arbitrary number of orders have been executed. Usually
an analysis of flip-flop indicator light patterns for a few steps
preceding the alarm will show where information is failing to transfer
properly. Then simple signal tracing in the suspected circuits using
the test oscilloscope and remote video probes will pinpoint the
difficulty.
5.3 Intermittent Failures


The most troublesome failures in Whirlwind, as, I suspect, in
any computer, are intermittent failures. Usually the amount of data
available is highly inadequate for localizing the difficulty so one
is forced to use cut and try procedures. The search for an intermittent
starts with a study of all available reports on recent transient failures:
report forms filled out by users, photographs of the indicators and
controls taken following unexplainable errors, and any observations made
by engineers and technicians while working on the system. From such
information, a technician familiar with the machine logic, in general,
can estimate what area of the computer produced the failure. He then
inserts a program and tests the suspected components or panels, lightly
tapping them to see if any errors are introduced. A momentary short
between a control grid and another element in a gate tube is an example of
an intermittent failure which can be located quite regularly. It
generally will cause an output pulse from the tube even when no input pulse
is supplied. If such a failure were suspected, the program inserted
would be one which supplied no input signals to the tube but which checked
for presence of output pulses from it.

In carrying out cut and try procedures for locating intermittents,
the cyclic program control and marginal checking facilities may also prove
useful. There was a recent instance where the computer showed symptoms of
an intermittent failure which was later tracked down by means of special
diagnostic programs and the use of the cyclic program control. After the
trouble was located, it was obvious that marginal checking would also have
pointed out the defect. In this instance, the symptoms indicated that a
register occasionally was not being cleared at the proper time. A special
program designed to emphasise this failure was inserted. It uncovered the
fact that the clearing operation was correct but the register was receiving
a spurious read-in shortly after the clear pulse. This was traced to an
improperly terminated delay line which was reflecting a delayed pulse with
sufficient amplitude to cause the occasional read-in. However, the faulty
delay-line condition had existed for some time. It was discovered that
the routine action of replacing the buffer amplifier that fed the delay
line was the direct cause of the intermittent trouble. It gave a somewhat
higher output so the unwanted reflection occasionally exceeded the
permissible limit for noise in that circuit. If marginal checking had been
performed on this amplifier, the line would have shown a very low margin
so the defect could also have been readily found by that means.


6.0 SUMMARY


As a brief review of my remarks, I will show some slides which
illustrate the more significant points that have been covered.
(SLIDE 1)

In the first slide are listed the types of failures which have
shown up in operation and maintenance of the Whirlwind I computer: gradual
deterioration, sudden failures, intermittent failures, and weaknesses due
to new equipment installation. The intermittent type are most troublesome
since the other types can be dealt with in a routine and straightforward
manner.

(SLIDE  2) 


The second slide is a view of a part of the computer showing
the open type of construction used. This suggests why it is practical
and desirable to repair circuits in place rather than to replace panels.
Remote video probes can be placed on any point in a circuit for viewing
waveforms on a central test oscilloscope. As I have pointed out, this has
had an influence on the trouble location procedures that have been developed.

(SLIDE  3) 


The next slide shows the computer control center. The flexibility
provided by this relay-rack type of installation permitted frequent
alteration of the control facilities while trouble location techniques were being
worked out. Grouped in this area are the marginal checking controls,
flip-flop indicators, alarm lights, switches for controlling the computer
operation and inserting or altering its program, a computer output display scope,
test oscilloscopes with pushbutton selection of many critical waveforms or
signals from remote video probes, and a master intercom station for
communication with other computer working areas.

(SLIDE 4)


In maintenance procedures, major use is made of the marginal
checking facilities built into the Whirlwind computer. They are used daily
in routine examinations of the system for deteriorating circuits. These
daily tests provide records of gradual deterioration so most component
replacement can be done during scheduled maintenance periods. This slide
shows a typical record of deterioration on one line. The dated entries
are new voltage excursions set in after the program failed with the
previous excursion. In December 1952 the negative margin dropped to the
danger point of 12 volts. Two tubes were replaced and the original margin
was restored. The marginal checking equipment is also invaluable in
evaluating the performance of newly installed equipment as well as in isolating
intermittent failures that inadvertently may result when installation or
repair work is done.

(SLIDE  5) 


Sudden failures are analysed by utilising the cyclic program
control and observing results on indicator lights and on the test
oscilloscope. Intermittent failures require careful study of all available
symptoms and a shrewd estimate by an experienced operator of where to
look for the trouble. This slide is a typical photograph of the
operating console taken after an alarm. It shows indicator light patterns
and switch settings which can be analyzed when tracking down failures.

In both of these cases the computer program used is highly significant
but little success has been achieved in developing one that is universally
useful. Instead it has been found that relatively simple order sequences
uniquely designed for the problem at hand and modified as test results
require are a more powerful tool.

(SLIDE 6)


An adequate measure of the effectiveness of trouble location
procedures in Whirlwind is difficult where new installation is continually
being carried out. On this slide, however, are listed some data that I
feel have significance. Of the time scheduled for useful computation
during the past year about 90 percent was usable. This figure is based
on reports submitted by groups using the computer rather than on statements
of personnel maintaining it. During that period there has been an average
of about 100 man hours of installation work per week done on a weekly basis.
Twenty four hours per week of preventive maintenance is listed. About half
of this is routine daily checking while the remainder is test periods
following installation work. The average length of the periods when the computer
has been forced out of operation during scheduled computation work is of the
order of 20 minutes.

Although this record may be a tolerable one at present, continued
effort is being expended to better it. Most needed is a more powerful attack
on the problem of intermittent failures.
TYPES OF FAULTS

DETERIORATION 
SUDDEN  FAILURES 
INTERMITTENT  FAILURES 
MALADJUSTMENT  IN  NEW  EQUIPMENT 


COMPUTER  CONTROL  ROOM 


MARGINAL CHECKING RECORD

LINE 268        DANGER POINT 12 VOLTS        PROGRAM T-2012

DATE         EXCURSIONS             REMARKS
31 AUG 51    -31            +40
 8 JAN 52    -31            +36
 2 APR 52    -30            +40
24 SEP 52    -20            +40
18 DEC 52    [illegible]    +40
 3 JAN 53    -30            +40     TWO TUBES REPLACED


SLIDE 5

INDICATOR LIGHT PATTERN FOLLOWING AN ALARM


MAINTENANCE EFFECTIVENESS

SCHEDULED TIME USABLE            90   PERCENT
INSTALLATION TIME                100  MAN HRS/WEEK
PREVENTIVE MAINTENANCE TIME      24   HRS/WEEK
AVERAGE UNSCHEDULED DOWN TIME    20   MINUTES